Thematic Minireview Series on Results from the ENCODE Project: Integrative Global Analyses of Regulatory Regions in the Human Genome
Abstract
The Encyclopedia of DNA Elements (ENCODE) Project (http://www.genome.gov/10005107) is an international collaboration of research groups funded by the National Human Genome Research Institute, with the goal of delineating all functional elements encoded in the human genome (1). This project began in 2003 with a targeted analysis of a selected 1% of the human genome. The results from the pilot project were published in 2007 (2), and a second phase of funding was then provided to scale the project to the entire human genome. Genome-scale projects in ENCODE involve the identification and quantification of RNA species in whole cells and subcellular compartments, mapping of noncoding and protein-coding genes by manual review and experimental methods, delineation of chromatin and DNA accessibility, mapping of histone modifications and transcription factor-binding sites by ChIP, and measurement of DNA methylation. More recently, ENCODE has adopted additional approaches that have not yet resulted in extensive data sets, including the examination of long-range chromatin interactions, analysis of RNA-binding proteins, and validation of transcriptional enhancers and silencers. To date, 2000 data sets have been deposited for public use by the ENCODE Project at the University of California Santa Cruz (UCSC) Genome Browser (3); to encourage public use of the data sets, a “user’s guide” to the ENCODE data sets has been published (4). As the second phase of the ENCODE Project nears completion, the ENCODE Consortium has prepared a large integrative manuscript that includes analyses of experiments from 147 cell types and provides a summary of their functional annotation of the human genome (5). Additionally, other more narrowly focused studies on subsets of ENCODE data have been or will soon be published; for a list of ENCODE publications, see the ENCODE tab at the UCSC Genome Bioinformatics site.
Many new insights concerning the organization and function of genomic elements have come from the ENCODE Project, including the findings that most transcription factors have many thousands of binding sites in the human genome and that these binding sites are distributed non-randomly, with only approximately one-third being located near a transcription start site (5). Many of these distally located regions of transcription factor-binding sites are thought to be transcriptional enhancers. Because enhancers are far from genes, can work in either orientation, and can sometimes skip over the nearest gene, they have been difficult to characterize in the past. However, the study of enhancers has gained enormous momentum from high-throughput methods such as ChIP-seq and comprehensive analyses from genomic projects such as ENCODE and the Roadmap Epigenomics Project. Based on current estimates of up to 50,000 enhancers in any given cell type and the fact that enhancers tend to be cell type-specific, it has been estimated that there are perhaps 10^5–10^6 enhancers in the human genome. Studies indicate that the majority of enhancers are composed of transcription factor-binding sites residing within nucleosome-free regions flanked by specific patterns of histone modifications (Fig. 1). The minireview entitled “Chromatin Fingerprint of Gene Enhancer Elements” by Gabriel E. Zentner and Peter C. Scacheri reviews the types of variant and modified histones and histone-modifying complexes found at enhancers and describes subclasses of active and poised enhancers. An emerging concept is that transcription factors bound to distal enhancer elements regulate genes by looping out the intervening DNA and interacting with other factors bound at promoter regions that can be tens to hundreds of kilobases away.
It is becoming clear that not only are chromatin-remodeling complexes required to achieve and maintain the nucleosome-free regions of enhancers that are bound by the site-specific factors but that they are also involved in the formation of chromosomal loops. One such chromatin-remodeling complex is SWI/SNF, a DNA-dependent ATPase. Human SWI/SNF complexes contain 10–12 subunits, many of which have alternative forms encoded by different members of gene families, resulting in many different possible SWI/SNF complexes. Components of SWI/SNF have specific protein domains that can recognize the acetylated or methylated histones that are found at enhancer regions, thus providing anchoring to nucleosomes. SWI/SNF can also interact with a variety of site-specific DNA-binding transcription factors. The ability of the SWI/SNF complex to interact with both DNA-bound factors and nucleosomes may contribute to its ability to form or stabilize chromosomal loops. As described in the minireview by Ghia Euskirchen, Raymond K. Auerbach, and Michael Snyder entitled “SWI/SNF Chromatin-remodeling Factors: Multiscale Analyses and Diverse Functions,” changes in abundance, structure, or activity of different components can alter the function of SWI/SNF in different types of normal or diseased cells. In a recent genome-wide study of SWI/SNF components, it was found that many of the binding sites are also bound by the
Author’s Choice: final version, full access. To whom correspondence should be addressed. E-mail: pfarnham@usc.edu.
The Journal of Biological Chemistry, Vol. 287, No. 37, pp. 30885–30887, September 7, 2012. © 2012 by The American Society for Biochemistry and Molecular Biology, Inc. Published in the U.S.A.